XAI Questionnaire Analyser¶

This notebook analyses the results of the XAI questionnaire titled "Survey of the interpretability of decision trees", available at this GitHub repository. The goal of the questionnaire was to evaluate the interpretability of decision trees.

The notebook is divided into sections, each named after the corresponding section of the XAI questionnaire that it analyses. Each section explains what information was provided to the survey participants and highlights the results obtained.

Suggestion on How to Run the Notebook (If run in a notebook environment)¶

We suggest using the "Run all" option of the notebook interpreter instead of running one cell at a time. In a Jupyter environment, the option is available in the menu bar under Run -> Run All Cells.


Table of Contents¶

  • Explanation
  • Comprehension Test Results
  • Questions Section Results
    • Question 1
    • Question 2
    • Question 3
    • Question 4
    • Question 5
  • Explanation 1 VS Explanation 2
  • Participants' Demographic Analysis
    • Participants Gender
    • Participants Age
    • Participants Education Level
    • Participants English Level

Explanation ¶

The XAI questionnaire presented the evaluation participants with two representations of the same domain of interest. Both representations were decision trees, and they differed in size/depth. The decision tree representations were:

3-Layers Decision Tree¶

The following model has a prediction accuracy of 0.927.

[Figure: 3-layers decision tree, rendered with Matplotlib v3.5.0. The root's left edge is labelled ≤ and its right edge >; a class legend accompanies the tree.]

5-Layers Decision Tree¶

The following model has a prediction accuracy of 0.978.

[Figure: 5-layers decision tree, rendered with Matplotlib v3.5.0. The root's left edge is labelled ≤ and its right edge >; a class legend accompanies the tree.]

Important information regarding the notation¶

In the following analysis, the 3-layers decision tree will be referred to as "explanation 1", while the 5-layers decision tree will be referred to as "explanation 2".


Comprehension Test Results ¶

The comprehension test section checks the notions the participants acquired from the questionnaire and verifies the quality of their mental model. The test is built around the sample with the following features.

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
12.42 2.55 2.27 22.0 90.0 1.68 1.84 0.66 1.42 2.7 0.86 3.3 315.0

The questionnaire participants had to answer two questions based on the visual explanation, which represents the model as a decision tree. The visual explanation provided (the 3-Layers Decision Tree) is shown below.

[Figure: 3-layers decision tree, identical to the one shown in the "3-Layers Decision Tree" section above.]

The two questions posed to the participants of the questionnaire are the following:

  • Q1: To which class does the wine with the following features belong? Correct Answer: Class 2
  • Q2: Which of the following features/attributes did you consider for the classification? Correct Answer: Proline, OD280/OD315, and Flavanoids

The Comprehension Test section results are:

  • The Q1 accuracy is 0.404
  • The Q2 accuracy is 0.246
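
The two accuracies above can be read as the fraction of participants who answered correctly. The following is a minimal sketch of that computation, not the notebook's own code. It assumes that Q2, a multiple-selection question, counts as correct only when the selected features match the correct set exactly; the scoring rule actually used is not stated here. The function names and the sample answers are hypothetical.

```python
def single_choice_accuracy(answers, correct):
    """Fraction of answers equal to the correct choice (None if no answers)."""
    if not answers:
        return None
    return sum(a == correct for a in answers) / len(answers)


def multi_choice_accuracy(answers, correct):
    """Fraction of answers whose selected set exactly matches the correct set."""
    if not answers:
        return None
    target = frozenset(correct)
    return sum(frozenset(a) == target for a in answers) / len(answers)


# Hypothetical responses, for illustration only (not the survey data).
q1_answers = ["Class 2", "Class 1", "Class 2", "Class 3"]
q2_answers = [
    {"Proline", "OD280/OD315", "Flavanoids"},
    {"Proline", "Alcohol"},
    {"Proline", "OD280/OD315", "Flavanoids", "Hue"},  # one extra feature: wrong
    {"Flavanoids", "Proline", "OD280/OD315"},         # order does not matter
]
Q2_CORRECT = {"Proline", "OD280/OD315", "Flavanoids"}

q1_acc = single_choice_accuracy(q1_answers, "Class 2")
q2_acc = multi_choice_accuracy(q2_answers, Q2_CORRECT)
print(q1_acc)  # 0.5 (2 of 4 correct)
print(q2_acc)  # 0.5 (2 of 4 exact matches)
```

Under exact-match scoring, a participant who selects the three correct features plus one extra is counted as wrong, which may partly explain why the Q2 accuracy is lower than the Q1 accuracy.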

Questions Section Results ¶

The "Questions" section of this XAI questionnaire was structured to test the "Transparency" aspect of the explanations provided to the evaluation participants. To do so, it was designed as a "Forward Simulation" task: participants are given an input and an explanation and are asked to predict the system's output. The section was composed of five questions, each asking the participants to pick one of three choices (the three classes of the dataset). The questions, and the explanation shown with each of them, were presented to the participants at random.
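
To make the Forward Simulation task concrete, the sketch below walks one sample through a small decision tree the way a participant would: at each node the named feature is compared with a threshold, going left on ≤ and right on >, until a leaf is reached. The tree is a toy: its shape, thresholds, and leaf labels are invented placeholders and are not the values of the questionnaire's trees. Only the feature values of the sample are taken from the comprehension test, and `forward_simulate` and `TOY_TREE` are hypothetical names.

```python
# Internal node: (feature, threshold, left_subtree, right_subtree).
# Leaf: a class label string.
# HYPOTHETICAL tree: thresholds and labels are placeholders for illustration.
TOY_TREE = (
    "Proline", 755.0,
    ("OD280-OD315", 2.1,
        ("Hue", 0.9, "Class 3", "Class 2"),
        ("flavanoids", 0.8, "Class 3", "Class 2")),
    ("flavanoids", 2.2, "Class 2", "Class 1"),
)


def forward_simulate(tree, sample):
    """Follow the tree for one sample.

    Returns (predicted_label, features_consulted_in_order).
    """
    path = []
    node = tree
    while not isinstance(node, str):  # stop when a leaf label is reached
        feature, threshold, left, right = node
        path.append(feature)
        # Left branch when value <= threshold, right branch otherwise.
        node = left if sample[feature] <= threshold else right
    return node, path


# Feature values of the comprehension-test sample (only those the toy tree uses).
sample = {"Proline": 315.0, "OD280-OD315": 3.3, "flavanoids": 1.84, "Hue": 0.86}

label, path = forward_simulate(TOY_TREE, sample)
print(label)  # Class 2 (on this toy tree)
print(path)   # ['Proline', 'OD280-OD315', 'flavanoids']
```

The returned path is the list of features a participant would need to read off the tree, which is the same kind of information the comprehension test's second question asks for.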

This section will show the prediction accuracy of the participants and the time taken by the participants to answer the questions by taking into account the explanation they received during the evaluation.

NB:

If the data for one of the two explanations is missing, the graph associated with it will not be displayed.
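
The per-question figures reported below (accuracy, plus mean, median, and standard deviation of the time taken) can be obtained by grouping the responses by the explanation each participant received. The following is a minimal sketch under stated assumptions: the record layout (`explanation`, `answer`, `time_taken`) and the sample values are hypothetical, and the sample standard deviation (`statistics.stdev`) is used because it is not stated which variant the notebook computes.

```python
from statistics import mean, median, stdev


def summarise(responses, correct):
    """Group responses by explanation and compute accuracy and time statistics."""
    groups = {}
    for r in responses:
        groups.setdefault(r["explanation"], []).append(r)

    summary = {}
    for explanation, rows in sorted(groups.items()):
        times = [r["time_taken"] for r in rows]
        n_correct = sum(r["answer"] == correct for r in rows)
        summary[explanation] = {
            "accuracy": round(n_correct / len(rows), 3),
            "mean": round(mean(times), 3),
            "median": round(median(times), 3),
            # stdev needs at least two data points.
            "std": round(stdev(times), 3) if len(times) > 1 else None,
        }
    return summary


# Hypothetical responses to one question whose correct answer is "Class 1".
responses = [
    {"explanation": 1, "answer": "Class 1", "time_taken": 12.0},
    {"explanation": 1, "answer": "Class 2", "time_taken": 18.0},
    {"explanation": 2, "answer": "Class 1", "time_taken": 10.0},
    {"explanation": 2, "answer": "Class 1", "time_taken": 14.0},
    {"explanation": 2, "answer": "Class 3", "time_taken": 21.0},
]

summary = summarise(responses, "Class 1")
for explanation, stats in summary.items():
    print(f"Explanation {explanation}: {stats}")
```

An explanation that received no responses never appears in `summary`, which mirrors the note above: there is nothing to plot for it.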

Question 1 ¶

The sample used in question 1 comes from row 0 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
14.23 1.71 2.43 15.6 127.0 2.8 3.06 0.28 2.29 5.64 1.04 3.92 1065.0

The correct classification of the sample was "Class 1". The results obtained are:

  • Prediction Accuracy on Explanation 1: 0.407
  • Time Taken to answer the question with Explanation 1: Mean = 16.727, Median = 18.521, Standard Deviation: 5.884
  • Prediction Accuracy on Explanation 2: 0.567
  • Time Taken to answer the question with Explanation 2: Mean = 15.203, Median = 13.61, Standard Deviation: 6.545

Question 2 ¶

The sample used in question 2 comes from row 58 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
12.37 1.13 2.16 19.0 87.0 3.5 3.1 0.19 1.87 4.45 1.22 2.87 420.0

The correct classification of the sample was "Class 2". The results obtained are:

  • Prediction Accuracy on Explanation 1: 0.387
  • Time Taken to answer the question with Explanation 1: Mean = 13.454, Median = 10.787, Standard Deviation: 5.639
  • Prediction Accuracy on Explanation 2: 0.462
  • Time Taken to answer the question with Explanation 2: Mean = 17.244, Median = 16.908, Standard Deviation: 6.519

Question 3 ¶

The sample used in question 3 comes from row 156 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.71 5.65 2.45 20.5 95.0 1.68 0.61 0.52 1.06 7.7 0.64 1.74 740.0

The correct classification of the sample was "Class 3". The results obtained are:

  • Prediction Accuracy on Explanation 1: 0.5
  • Time Taken to answer the question with Explanation 1: Mean = 14.467, Median = 13.886, Standard Deviation: 4.998
  • Prediction Accuracy on Explanation 2: 0.303
  • Time Taken to answer the question with Explanation 2: Mean = 15.897, Median = 15.399, Standard Deviation: 7.125

Question 4 ¶

The sample used in question 4 comes from row 164 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.74 1.67 2.25 16.4 118.0 2.6 2.9 0.21 1.62 5.85 0.92 3.2 1060.0

The correct classification of the sample was "Class 1". The results obtained are:

  • Prediction Accuracy on Explanation 1: 0.286
  • Time Taken to answer the question with Explanation 1: Mean = 15.213, Median = 14.125, Standard Deviation: 5.99
  • Prediction Accuracy on Explanation 2: 0.414
  • Time Taken to answer the question with Explanation 2: Mean = 18.08, Median = 18.636, Standard Deviation: 5.201

Question 5 ¶

The sample used in question 5 comes from row 177 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.4 3.91 2.48 23.0 102.0 1.8 0.75 0.43 1.41 7.3 0.7 1.56 750.0

The correct classification of the sample was "Class 3". The results obtained are:

  • Prediction Accuracy on Explanation 1: 0.469
  • Time Taken to answer the question with Explanation 1: Mean = 15.384, Median = 15.444, Standard Deviation: 6.163
  • Prediction Accuracy on Explanation 2: 0.32
  • Time Taken to answer the question with Explanation 2: Mean = 14.618, Median = 15.225, Standard Deviation: 5.166

Explanation 1 VS Explanation 2 ¶

This section reports the participants' prediction accuracy and the time taken to answer, aggregated over all five questions, for explanation 1 and for explanation 2.

  • Prediction Accuracy on Explanation 1: 0.408
  • Time Taken to answer the questions with Explanation 1: Mean = 15, Median = 15.046, Standard Deviation: 5.943
  • Prediction Accuracy on Explanation 2: 0.413
  • Time Taken to answer the questions with Explanation 2: Mean = 15.736, Median = 15.71, Standard Deviation: 5.797
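
As a rough consistency check, the per-question accuracies reported in the "Questions Section Results" can be averaged without weights and compared with the aggregated accuracies above. The sketch below uses only the figures listed in this notebook; the variable names are hypothetical.

```python
from statistics import mean

# Per-question prediction accuracies as reported for questions 1 to 5.
per_question_accuracy = {
    "Explanation 1": [0.407, 0.387, 0.5, 0.286, 0.469],
    "Explanation 2": [0.567, 0.462, 0.303, 0.414, 0.32],
}

unweighted = {
    name: mean(values) for name, values in per_question_accuracy.items()
}

for name, value in unweighted.items():
    print(f"{name}: {value:.3f}")
# Explanation 1: 0.410
# Explanation 2: 0.413
```

The unweighted means sit close to the aggregated values of 0.408 and 0.413. A small gap is expected if the two explanations were not shown equally often on every question, because the aggregated accuracy then weights each question by its number of responses; this is an assumption, since the per-question response counts are not listed here.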

Participants' Demographic Analysis ¶

The questionnaire was completed by 62 participants. Below you will find some information about them.

Participants Gender ¶

Participants of the evaluation could select one of the following choices:

  • Male
  • Female
  • Other
  • Prefer not to say

Participants Age ¶

Participants of the evaluation could select one of the following choices:

  • 18-20
  • 21-29
  • 30-39
  • 40-49
  • 50-59
  • 60 or older

Participants Education Level ¶

Participants of the evaluation could select one of the following choices:

  • Less than high school degree
  • High school degree or equivalent
  • Undergraduate
  • Graduate

Participants English Level ¶

Participants of the evaluation could select one of the following choices:

  • Beginner (A1)
  • Elementary (A2)
  • Lower Intermediate (B1)
  • Upper Intermediate (B2)
  • Advanced (C1)
  • Proficient (C2)